WHAT IS CLAIMED IS: 

1. An encoding apparatus for encoding frame data 
containing image data and sound data, comprising: 

separating means for separating the image data
and the sound data contained in the frame data;

image data encoding means for encoding the 
separated image data in sequence from a lower to a 
higher frequency component thereof, thereby generating 
image encoded data; 

sound data encoding means for encoding the 
separated sound data in sequence from a lower to a 
higher frequency component thereof, thereby generating 
sound encoded data; and 

frame encoded data generating means for 
generating header information by using the image 
encoded data and the sound encoded data, and generating 
frame encoded data by using the header information, the 
image encoded data, and the sound encoded data. 

2. The apparatus according to claim 1, wherein the 
header information contains at least one of the size of 
the image data, image type information of the image 
data, the length of the image encoded data, the length 
of the sound encoded data, identification information 
of said encoding apparatus, the transmission date and 
time, the start address of the image encoded data, and 
the start address of the sound encoded data.

3. The apparatus according to claim 1, wherein said 
image data encoding means generates a transform
coefficient sequence for subbands by performing 
discrete wavelet transform for the image data, groups 
subbands of the same level, and sequentially encodes 
the transform coefficient sequence from a lower- to a 
higher-level subband, thereby generating the image 
encoded data. 

4. The apparatus according to claim 1, wherein said 
sound data encoding means generates a transform 
coefficient sequence for subbands by performing 
discrete wavelet transform for the sound data, groups 
subbands of the same level, and sequentially encodes 
the transform coefficient sequence from a lower- to a 
higher-level subband, thereby generating the sound 
encoded data. 

5. The apparatus according to claim 1, wherein said 
frame encoded data generating means generates the frame 
encoded data by arranging the header information, the 
image encoded data, and the sound encoded data in the 
order named. 

6. The apparatus according to claim 1, wherein said 
frame encoded data generating means generates the frame 
encoded data by grouping subbands of the same level in 
the image encoded data and the sound encoded data, and 
arranging the groups in ascending order of level 
following the header information. 

7. The apparatus according to claim 1, wherein said
frame encoded data generating means generates the frame
encoded data by using quasi-frame encoded data composed 
of a portion of the image encoded data and a portion of 
the sound encoded data. 

8. An encoding apparatus for encoding frame data
containing image data and sound data, comprising: 

separating means for separating the image data 
and the sound data contained in the frame data; 

image data encoding means for hierarchizing the 
image data into a plurality of types of image data and 
encoding the plurality of types of image data, thereby 
generating image encoded data corresponding to a 
plurality of levels; 

sound data encoding means for hierarchizing the 
sound data into a plurality of types of sound data and 
encoding the plurality of types of sound data, thereby 
generating sound encoded data corresponding to a 
plurality of levels; and 

frame encoded data generating means for 
generating frame encoded data by using the image 
encoded data and the sound encoded data, 

wherein said frame encoded data generating means 
generates the frame encoded data by forming a plurality 
of groups of different levels by grouping the image 
encoded data and sound encoded data belonging to the 
same level determined on the basis of a predetermined 
reference, and arranging the plurality of groups in
descending order of significance level.

9. The apparatus according to claim 8, wherein the 
plurality of types of image data hierarchized by said 
image data encoding means correspond to a plurality of 
frequency components obtained by discrete wavelet 
transform of the image data. 

10. The apparatus according to claim 8, wherein the 
plurality of types of sound data hierarchized by said 
sound data encoding means correspond to speech data 
which corresponds to a human voice and non-speech data 
other than the speech data. 

11. The apparatus according to claim 10, wherein said 
frame encoded data generating means groups encoded data 
of the speech data as sound encoded data of significant 
level together with first image encoded data, and 
groups encoded data of the non-speech data as sound 
encoded data of insignificant level together with 
second image encoded data. 

12. The apparatus according to claim 11, wherein 

the plurality of types of image data hierarchized 
by said image data encoding means contain a first 
frequency component obtained by discrete wavelet 
transform of the image data and a second frequency 
component higher than the first frequency component, 
and 

the first and second image encoded data 
correspond to the first and second frequency components,
respectively.

13. The apparatus according to claim 8, wherein the 
plurality of types of sound data hierarchized by said 
sound data encoding means correspond to speech data 
which corresponds to a human voice and not less than 
two non-speech data obtained by hierarchizing 
non-speech data other than the speech data. 

14. The apparatus according to claim 12, wherein said 
frame encoded data generating means 

groups encoded data of the speech data as sound 
encoded data of most significant level together with 
the first image encoded data, 

groups encoded data of first non-speech data 
obtained by hierarchizing the non-speech data, as sound 
encoded data of the level next in significance to the
most significant level, together with the second image
encoded data, and 

groups encoded data of second non-speech data 
other than the first non-speech data, obtained by 
hierarchizing the non-speech data, together with third 
image encoded data. 

15. The apparatus according to claim 14, wherein the 
plurality of types of image data hierarchized by said 
image data encoding means contain a first frequency 
component obtained by discrete wavelet transform of the 
image data, a second frequency component higher than 
the first frequency component, and a third frequency
component higher than the second frequency component,
and 

the first, second, and third image encoded data 
correspond to the first, second, and third frequency 
components, respectively.

16. The apparatus according to claim 8, wherein said 
frame encoded data generating means groups the image 
encoded data and the sound encoded data by selectively
using a plurality of types of grouping methods. 

17. The apparatus according to claim 16, wherein the 
plurality of types of grouping methods include a 
grouping method which gives priority to image quality 
and a grouping method which gives priority to sound 
quality.

18. The apparatus according to claim 16, further 
comprising:

transmitting means for transmitting the frame 
encoded data; 

detecting means for detecting a decoding status 
of the transmitted frame encoded data; and 

control means for switching the grouping methods 
in accordance with the detected decoding status. 

19. An encoding method of encoding frame data 
containing image data and sound data, comprising: 

the separating step of separating the image data 
and the sound data contained in the frame data; 

the image data encoding step of encoding the
separated image data in sequence from a lower to a
higher frequency component thereof, thereby generating 
image encoded data; 

the sound data encoding step of encoding the 
separated sound data in sequence from a lower to a 
higher frequency component thereof, thereby generating 
sound encoded data; and 

the frame encoded data generating step of 
generating header information by using the image 
encoded data and the sound encoded data, and generating 
frame encoded data by using the header information, the 
image encoded data, and the sound encoded data.

20. An encoding method of encoding frame data
containing image data and sound data, comprising: 

the separating step of separating the image data 
and the sound data contained in the frame data; 

the image data encoding step of hierarchizing the 
image data into a plurality of types of image data and 
encoding the plurality of types of image data, thereby 
generating image encoded data corresponding to a 
plurality of levels; 

the sound data encoding step of hierarchizing the 
sound data into a plurality of types of sound data and 
encoding the plurality of types of sound data, thereby 
generating sound encoded data corresponding to a 
plurality of levels; and 

the frame encoded data generating step of
generating frame encoded data by using the image
encoded data and the sound encoded data, 

wherein the frame encoded data generating step 
generates the frame encoded data by forming a plurality 
of groups of different levels by grouping the image 
encoded data and sound encoded data belonging to the 
same level determined on the basis of a predetermined 
reference, and arranging the plurality of groups in 
descending order of significance level.

21. A program which, when executed by a computer,
allows the computer to function as an encoding 
apparatus for encoding frame data containing image data 
and sound data, comprising: 

a code of the separating step of separating the 
image data and the sound data contained in the frame 
data; 

a code of the image data encoding step of 
encoding the separated image data in sequence from a 
lower to a higher frequency component thereof, thereby 
generating image encoded data; 

a code of the sound data encoding step of 
encoding the separated sound data in sequence from a 
lower to a higher frequency component thereof, thereby 
generating sound encoded data; and 

a code of the frame encoded data generating step 
of generating header information by using the image 
encoded data and the sound encoded data, and generating
frame encoded data by using the header information, the
image encoded data, and the sound encoded data.

22. A program which, when executed by a computer,
allows the computer to function as an encoding 
apparatus for encoding frame data containing image data 
and sound data, comprising: 

a code of the separating step of separating the 
image data and the sound data contained in the frame 
data; 

a code of the image data encoding step of 
hierarchizing the image data into a plurality of types 
of image data and encoding the plurality of types of 
image data, thereby generating image encoded data 
corresponding to a plurality of levels; 

a code of the sound data encoding step of 
hierarchizing the sound data into a plurality of types 
of sound data and encoding the plurality of types of 
sound data, thereby generating sound encoded data 
corresponding to a plurality of levels; and 

a code of the frame encoded data generating step 
of generating frame encoded data by using the image 
encoded data and the sound encoded data, 

wherein the frame encoded data generating step 
generates the frame encoded data by forming a plurality 
of groups of different levels by grouping the image 
encoded data and sound encoded data belonging to the 
same level determined on the basis of a predetermined 
reference, and arranging the plurality of groups in
descending order of significance level.

23. A recording medium recording the program
according to claim 21.

24. A recording medium recording the program
according to claim 22. 
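
The two arrangements of frame encoded data recited in claims 5 and 6 can be illustrated with a minimal sketch. It is not taken from the specification: the function names, the use of byte strings as stand-ins for per-level encoded data, and the header layout (two big-endian 32-bit lengths, one of the options listed in claim 2) are all assumptions made for illustration only.

```python
import struct
from typing import List

# Hypothetical header layout (an assumption, not defined by the claims):
# length of the image encoded data, then length of the sound encoded data,
# each as a big-endian unsigned 32-bit integer.
HEADER_FORMAT = ">II"


def make_header(image_levels: List[bytes], sound_levels: List[bytes]) -> bytes:
    """Generate header information from the image and sound encoded data."""
    image_len = sum(len(chunk) for chunk in image_levels)
    sound_len = sum(len(chunk) for chunk in sound_levels)
    return struct.pack(HEADER_FORMAT, image_len, sound_len)


def frame_in_order(image_levels: List[bytes], sound_levels: List[bytes]) -> bytes:
    """Claim 5 style: header, then image encoded data, then sound encoded data."""
    header = make_header(image_levels, sound_levels)
    return header + b"".join(image_levels) + b"".join(sound_levels)


def frame_by_level(image_levels: List[bytes], sound_levels: List[bytes]) -> bytes:
    """Claim 6 style: after the header, group image and sound data of the
    same level and arrange the groups in ascending order of level."""
    header = make_header(image_levels, sound_levels)
    body = bytearray()
    for level in range(max(len(image_levels), len(sound_levels))):
        if level < len(image_levels):
            body += image_levels[level]
        if level < len(sound_levels):
            body += sound_levels[level]
    return header + bytes(body)


# Placeholder encoded data, ordered from the lowest to the highest level.
image = [b"I0", b"I1"]
sound = [b"S0", b"S1"]
sequential = frame_in_order(image, sound)
interleaved = frame_by_level(image, sound)
```

In this sketch a receiver that obtains only the leading portion of `interleaved` already holds the lowest-level image and sound data together, whereas with `sequential` it would hold image data only. Both arrangements contain the same bytes and the same header.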